ABSTRACT

The purpose of this study was to investigate the
effects of hypothesis testing instructions compared to brief
instructions on the speed of shift problem solution of grade school
age subjects, in order to provide information on the development of
hypothesis testing behavior in children and the sampling
characteristics of hypothesis testing in these younger subjects. A
second purpose was to apply methods of analysis suggested by
quantitative models of concept identification to the data from a
traditional shift study with children, in order to provide some
information regarding the processes which result in concept
acquisition. Comparison of post-shift solution rates revealed that
the effect of the detailed instructions was to decrease the
difference between reversal (R) and non-reversal (NR) shifts in the
predicted direction, but R shifts were still solved more quickly than
NR shifts. Further analyses revealed that the data could not be
accounted for by no-memory hypothesis sampling models. It is
suggested that current developmental theories which can account for
the relative difficulty of R and NR shifts be elaborated into
quantitative models of children's concept learning. (Author/TA)
INSTRUCTIONS AND PROBLEM SHIFTS:

THEIR IMPLICATIONS FOR THEORY IN CONCEPT LEARNING

Recent investigations (Erickson, 1971; Erickson, Block, and Rulon, 1970)
of college-age Ss' acquisition of reversal and extradimensional shifts
revealed that the relative difficulty of solving these problems was
a function of the instructions given to Ss regarding the nature of the
task. Specifically, if the instructions were very brief, simply telling
Ss to discover some systematic relationship among the stimuli, then
the traditional shift relationship was obtained; that is, reversal
shifts were much easier to solve than extradimensional shifts. However,
when Ss were given very explicit and detailed instructions regarding
the task, pointing out the dimensions of the stimuli and explaining
the nature of the rule for stimulus classification, the relationship
between the shift problems was reversed. Since the data from many
concept identification experiments, in which great care is taken to
insure that Ss understand the nature of the task, can be accounted for
by hypothesis sampling models of concept identification (Bower and
Trabasso, 1964), the results were interpreted in this context. It was
suggested that when Ss re-sample from the pool of hypotheses after
an error trial, they tend to sample hypotheses from dimensions
other than the dimension on which their most recent hypothesis was
based.

The purpose of this study is to investigate the effects of hypothesis
testing instructions compared to brief instructions (modeled after the
instructions used by the Kendlers [Kendler and Kendler, 1959; Kendler,
Kendler, and Wells, 1960]) on the speed of shift problem solution
of much younger subjects, in order to provide information on the
development of hypothesis testing behavior in children and the sampling
characteristics of hypothesis testing in these younger Ss. A
second purpose of the study is to apply methods of analysis suggested
by quantitative models of concept identification (Suppes and Ginsberg, 1963)
to the data from a traditional shift study with children, in order to
provide some information regarding the processes which result in concept
acquisition.

Method 

Ninety-six randomly selected children from two age groups at a
local elementary school in Pittsburgh were Ss in the experiment. The
two age groups were 7 and 8 year old Ss, who were referred to as
the young age group, and 10, 11, and 12 year old Ss, who were referred to
as the old age group. The mean age of the young group was 7 years, 11
months and that of the old group was 11 years, 7 months. Within these two
age groups, half the Ss were assigned to a brief instruction condition
and half to a detailed, hypothesis testing instruction condition. Within
each age by instruction condition, half the Ss received a reversal shift
and half an extradimensional shift. The relevant stimulus dimension that
an S received was completely counterbalanced across Ss in each cell.

The design was completely randomized with 4 factors (Age x Instruction
x Shift x Dimension) with two levels of each factor. There were six
Ss per cell.
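
As a quick check on the cell counts, the factorial layout described above can be enumerated. This is only an illustrative sketch: the level labels (in particular the names given to the two stimulus dimensions) are shorthand introduced here and are not the authors' coding.

```python
from itertools import product

# Factor levels as described in the text. The string labels are
# illustrative assumptions, not taken from the original materials.
FACTORS = {
    "age": ("young", "old"),
    "instruction": ("brief", "detailed"),
    "shift": ("reversal", "extradimensional"),
    "dimension": ("form", "white_half"),
}
SS_PER_CELL = 6  # six Ss per cell, as stated in the Method section


def design_cells(factors=FACTORS):
    """Return every combination of factor levels as a dict."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


cells = design_cells()
total_ss = len(cells) * SS_PER_CELL
print(len(cells), "cells,", total_ss, "Ss")
```

With two levels on each of four factors there are 16 cells, and six Ss per cell gives the 96 Ss reported.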

The experiment required that all Ss be given 1 hour to solve the
first problem. However, due to computer scheduling, this sometimes was
not the case. The final data base consisted of the first 94 Ss run under the
appropriate conditions who solved both problems and an additional two
Ss who were the first Ss to solve the first problem in their cell
assignment.

The stimuli were squares and circles colored red with either the
top half or the bottom half colored white, as shown on page 1 of
your handout. The four stimuli were photographed and mounted on slides.



Slide selection was computer controlled and the slides were
back-projected onto a touch-sensitive screen. During every stimulus
presentation 2 spots labeled "A" and "B" appeared below the stimulus. A
touch applied to either spot "A" or "B" with sufficient pressure
activated a feedback tone that told the S the computer had received his
response. For correct responses, feedback was provided by the sounding
of a second tone, a flashing light, and a bead dispensed into a
clear plastic cup positioned in front of the S below the screen. The
beads could be exchanged for toys at the end of the experimental session.
For incorrect responses, no further events occurred during the rest of the
interval.



The four stimulus slides were presented in a random order in blocks
of 4, subject to the restrictions that no slide could be presented twice
in a row and that all 4 slides would be presented before the next
block of trials. The sequence of events during a typical trial was as
follows: (1) slide on; (2) S's response, with the response feedback tone
sounded concurrently; (3) (for correct responses only) 2 seconds after
the S's response the second feedback tone sounded, the light flashed,
and the bead was dispensed; (4) slide off. The stimulus remained on
the screen until either 2 seconds after an incorrect response or 2
seconds after feedback on a correct response. The intertrial interval
was 10 seconds.
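
The two ordering restrictions described above (each block contains all 4 slides, and no slide appears twice in a row, including across a block boundary) can be met by reshuffling a block whenever it begins with the slide that ended the previous block. The sketch below is one way to do this and is not a description of the original control program; the slide names are hypothetical labels.

```python
import random

# Hypothetical labels for the four stimulus slides.
SLIDES = ("square_top", "square_bottom", "circle_top", "circle_bottom")


def slide_sequence(n_blocks, rng=None):
    """Build a presentation order of n_blocks blocks of 4 slides.

    Each block is a random permutation of all 4 slides. A block is
    reshuffled if its first slide repeats the last slide of the
    preceding block, so no slide is ever shown twice in a row.
    """
    rng = rng or random.Random()
    sequence = []
    for _ in range(n_blocks):
        block = list(SLIDES)
        rng.shuffle(block)
        while sequence and block[0] == sequence[-1]:
            rng.shuffle(block)
        sequence.extend(block)
    return sequence


if __name__ == "__main__":
    print(slide_sequence(3, random.Random(1)))
```

Because a permutation never repeats a slide within a block, only the block boundary needs to be checked.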

All Ss were tested individually. The experimenter conducted each 
S into the experimental room and acquainted him with the apparatus. The 
experimenter explained the procedure to be used with a demonstration 
slide that had a blue triangle for the stimulus. The S was then read
either a brief or detailed set of standardized instructions depending 
on his assigned condition. In general, the problem was presented as 
a labeling game in which Ss had to decide which stimuli were called "A" 
and which were called "B". The brief instructions were much like those 
of the Kendlers. Ss were told: they would receive a bead for every
correct response; they should make one choice, A or B, every time;
they should look at the figure when they responded; and they should try
to get all correct responses in a row. Additionally, Ss in the detailed
instruction condition received information concerning the stimulus
dimensions, the values of the dimensions, and the nature of the
possible solutions. They were shown the set of the 4 stimuli, the
differences were pointed out, and the rules were illustrated. Then
Ss named the dimensions for the experimenter; if they expressed
difficulty, they were prompted. They were told that one of several possible
rules would govern the labeling of the stimuli and that they had to
figure out which rule was being used. They would know that they had
chosen the correct rule by getting all correct answers in a row. The
experimenter then started the problem and left the room. Ss were
terminated upon solving both problems or after one hour, whichever came
first. Ss solved the first problem to a criterion of 10 correct
responses in a row, then were immediately shifted to the second problem
without any warning. The solution of the second problem had relevant
dimensions that were either a reversal or a nonreversal of those in
the first problem. The solution values ("A" and "B") for the second
problem were selected after the S made a mandatory error on the first
trial in the second problem, for the nonreversal shift conditions only.
For the reversal shift the "A" and "B" values were merely switched.
Ss then solved this second problem to the criterion of 10 successive
correct responses.

The data from this study were analyzed under two criteria. The
first, more stringent criterion was 10 consecutive correct responses,
the criterion used in this experiment to determine when to shift
the problem solution. The data were also analyzed by applying a less
stringent and more traditional criterion, that is, by considering a
problem solved when Ss made 9 correct responses out of a block of 10
successive responses (Kendler, Kendler, and Wells, 1960). With these
two criteria, the major focus of analysis was the choice behavior on
trials preceding the stringent or the less stringent criterion. In almost
all cases, the results of the analyses are similar for the two
criteria, with the exception of the stationarity analyses to be
presented later.
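
To make the difference between the two criteria concrete, the sketch below scores a single response protocol (1 = correct, 0 = error) under each. It rests on two assumptions not settled by the text: the "block of 10 successive responses" is treated as a sliding window rather than as fixed, non-overlapping blocks, and the value returned is the trial on which the criterion is completed. The protocol itself is invented for illustration.

```python
def trial_criterion_met(responses, window=10, required=10):
    """Return the 1-based trial on which the criterion is first met.

    The criterion is met when at least `required` of the most recent
    `window` responses are correct. Returns None if it is never met.
    """
    correct_in_window = 0
    for i, r in enumerate(responses):
        correct_in_window += 1 if r else 0
        if i >= window:
            # Drop the response that has just left the window.
            correct_in_window -= 1 if responses[i - window] else 0
        if i + 1 >= window and correct_in_window >= required:
            return i + 1
    return None


# Hypothetical protocol: scattered early errors, last error on trial 9.
protocol = [0, 1, 1, 0, 1, 1, 1, 1, 0] + [1] * 10

stringent = trial_criterion_met(protocol, window=10, required=10)
lenient = trial_criterion_met(protocol, window=10, required=9)
print(stringent, lenient)
```

For this protocol the less stringent criterion is reached several trials earlier than the stringent one, which is why the presolution trials analyzed under the two criteria can differ.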
Analyses of variance were performed on trials and errors to
criterion in order to assess the effects of the four major variables.
The results of the analyses were quite similar for errors and trials,
and were similar under the application of either the stringent or the
less stringent criterion. The major result of these analyses for the
preshift problem was that there were no main effects of age,
instruction, relevant dimension, or shift, no strong two-way
interactions among these variables, and no four-way
interaction. There was, however, a three-way interaction present
in each analysis between shift, dimension, and age. Tables I and II
in your handout show the character of this interaction for errors and
trials to criterion. As can be seen, the random assignment procedure
was not successful in ensuring equivalent speed of learning for the
two shift groups on the first problem, since reversal Ss solved the
first problem faster. Further analyses of the interaction were done
to determine if significant preshift differences existed for all
combinations of shift and age; these analyses revealed that comparisons
of speed of problem solution under reversal and extradimensional shifts
could not unequivocally be made. The shift comparisons are confounded
with the effects of differential salience, of speed of preshift
solution (which is in the same direction as the shift results), or of both.
Thus, shift comparisons on the basis of errors or trials to criterion
obtained in this study cannot appropriately be related to the classic
body of literature and theory on discrimination shifts, which
requires that shift comparisons be unconfounded with the effects
of dimensional dominance and preshift acquisition rate (Kendler and
Kendler, 1962, 1968; Wolff, 1967). However, within the context
of a hypothesis theory which assumes that a dimension is sampled after
every error trial, and given the design of the current study, the differential
salience of the dimensions does not rule out the relevance of learning
rate comparisons between preshift and postshift problems. Also,
differences in preshift problem solving for the two shift groups do not
invalidate the usefulness of information from postshift rate comparisons
since, within the class of no-memory hypothesis sampling theories in
which errors function as recurrent events, there is presumably no
correlation between preshift and postshift problem solving rates (Bower
and Trabasso, 1964).

Analyses of postshift performance revealed that both shift type
and age significantly affected postshift performance, with reversal shifts
solved in fewer errors than extradimensional shifts, and younger Ss
making more errors to solution than older Ss. Some interesting
interactions were also observed and can be seen in Figures 2 and 3 of
your handout. Figure 2 reveals that for the younger Ss there is a
very large difference in trials to criterion between the reversal shift
and the extradimensional shift, while this difference, although still
significant, decreases with older Ss. The effect of instructions on
shift behavior can be seen in Figure 3. The detailed instructions
succeeded in reducing the differences between shifts, but the shifts
were still significantly different for both instructional conditions.
Thus, relative difficulty of shift type was maintained for both
instructional conditions, but the size of the difference was reduced
by the explicit instructions. The nature of the reduction was of the
same form at both age levels, as can be seen in Figures 4 and 5, and was
revealed in the statistical analysis as a three-way interaction between
instructions, shift and age that was not significant. In summary, then,
for the age groups studied herein, reversal and extradimensional shifts
were significantly different, with an extradimensional shift being more
difficult. Also, for a group of children ranging in age from 6 to
12 years old, more detailed, hypothesis testing instructions can reduce
the relative difficulty of the two shifts.



The choice data from the preshift problem trials before the
criterion run were analyzed for stationarity (Bower and Trabasso, 1964;
Suppes and Ginsberg, 1963), and the results are presented in Figures
6 and 7. When trials to the stringent criterion are analyzed, the
detailed instructions led to significant departures from stationarity for
both age groups, in contrast to the results usually observed with
college Ss. The brief instructions led to stationary responding for
both age groups. When the trials before a less stringent criterion
are analyzed, the younger subjects receiving detailed instructions
exhibited stationary responding before the trial of the last error. It
is clear, then, that instructions affect the processes of concept
acquisition but do not necessarily move them toward hypothesis testing
accounts that predict stationary presolution responding.
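
A stationarity analysis of the kind summarized above can be sketched as follows. Each S's responses before the trial of the last error (T.L.E.) are divided into four Vincent parts, correct responses are pooled over Ss within each part, and a chi-square test of homogeneity asks whether the proportion correct is constant across parts. The text does not say which test statistic was used, so the generic homogeneity chi-square here is an assumption, as is the rule for assigning trials to parts when the count is not divisible by 4. The protocols are invented.

```python
def presolution_trials(responses):
    """Return the responses that precede the trial of the last error."""
    error_positions = [i for i, r in enumerate(responses) if r == 0]
    if not error_positions:
        return []
    return list(responses[:error_positions[-1]])


def vincent_counts(protocols, parts=4):
    """Pool [n_correct, n_trials] over Ss for each Vincent part."""
    counts = [[0, 0] for _ in range(parts)]
    for responses in protocols:
        pre = presolution_trials(responses)
        n = len(pre)
        for j, r in enumerate(pre):
            part = j * parts // n  # assumed assignment rule
            counts[part][0] += r
            counts[part][1] += 1
    return counts


def stationarity_chi_square(counts):
    """Chi-square for equal proportion correct across the parts."""
    total_correct = sum(c for c, _ in counts)
    total_trials = sum(n for _, n in counts)
    if total_trials == 0 or total_correct in (0, total_trials):
        return 0.0
    p = total_correct / total_trials
    chi2 = 0.0
    for c, n in counts:
        if n == 0:
            continue
        chi2 += (c - n * p) ** 2 / (n * p)
        chi2 += ((n - c) - n * (1 - p)) ** 2 / (n * (1 - p))
    return chi2


# Hypothetical protocols (1 = correct, 0 = error).
flat = [[1, 0, 1, 0, 1, 0, 1, 0, 0]]
rising = [[0, 0, 0, 0, 1, 1, 1, 1, 0]]
print(stationarity_chi_square(vincent_counts(flat)))
print(stationarity_chi_square(vincent_counts(rising)))
```

With four parts the statistic has 3 degrees of freedom, so values above 7.815 indicate a departure from stationarity at the .05 level. A rising curve such as the second protocol produces a large value; a flat one does not.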

Additional analyses of distributions of total errors to criterion
for the two age groups solving under the detailed instructions
provided further verification of this suggestion. Figures 8 and 9 show
the observed distributions of total errors compared to the predictions
of a no-memory process model in which the probability of sampling the
correct hypothesis is 1/4. The data are also compared to the predicted
error distributions from the Bower-Trabasso model. Neither prediction
compares well with the data, and it is doubtful that any
models predicting geometric distributions of errors could account for
these preshift data.
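
For reference, the cumulative error distribution implied by a simple no-memory model can be computed directly. The sketch assumes that every S makes at least one error and resamples after each error with probability c of drawing the correct hypothesis, so that P(T = k) = c(1 - c)^(k - 1) for k = 1, 2, and so on. That parameterization is an assumption made here for illustration; the exact form fitted in the study is not given in the text. The error scores in the example are hypothetical.

```python
def geometric_error_cdf(c, max_errors):
    """P(T <= k) for k = 1..max_errors under the geometric model."""
    return [1.0 - (1.0 - c) ** k for k in range(1, max_errors + 1)]


def geometric_mean_var(c):
    """Mean and variance of total errors under the geometric model."""
    return 1.0 / c, (1.0 - c) / c ** 2


def empirical_cdf(errors, max_errors):
    """Observed cumulative proportion of Ss with k or fewer errors."""
    n = len(errors)
    return [sum(e <= k for e in errors) / n for k in range(1, max_errors + 1)]


if __name__ == "__main__":
    c = 0.25  # probability of sampling the correct hypothesis
    mean, var = geometric_mean_var(c)
    print("predicted mean errors:", mean, "variance:", var)

    observed = [1, 2, 2, 5, 9, 14, 30]  # hypothetical error scores
    predicted = geometric_error_cdf(c, 10)
    actual = empirical_cdf(observed, 10)
    for k, (p, a) in enumerate(zip(predicted, actual), start=1):
        print(k, round(p, 3), round(a, 3))
```

With c = 1/4 the model predicts a mean of 4 errors and a variance of 12. Observed distributions with much longer tails, as in the invented scores above, fall well below the predicted cumulative curve.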

Predictions of various statistics from the Bower-Trabasso model
were also generated for these data and are compared to the data in
Tables III and IV. The closer fits usually obtained with this model to
college Ss' data are not obtained here. The usual statistical tests
associated with this and other hypothesis sampling models also revealed
that the model does not fit the data well. Thus, the suggestion from
these data is that the experimental conditions which produce hypothesis
testing behavior of the kind described by some current models of college
Ss' behavior do not produce the same behavior in children.

Because the learning rate in the preshift problem was too slow to
compare well even to a no-memory model, relative differences in
shift difficulty will not be accounted for by adding memory processes
of the form suggested by current models of adult hypothesis sampling
behavior. Thus, probably the best theoretical direction in which to
turn would involve a translation and elaboration of extant
developmental theories of concept learning that can account for the shift
results observed herein into quantitative models. The models will make
extensive and detailed predictions about aspects of the data other than
over-all shift comparisons, and explicit assumptions about the nature
of the processes involved in children's concept learning. Quantitative
models which provide a good account of the data are basic to the
solution of instructional optimization problems, and as such are a
useful theoretical direction to pursue.
Figure 1: Stimuli Used In The Experiment.

Figure 2: Mean Trials To Criterion On Postshift Problem For
          Young And Old Ss (Stringent Criterion).

Figure 3: Mean Trials To Criterion On Postshift For Each
          Instruction Condition And Type Of Shift, Data
          Generated Under Less Stringent Criterion.

Figure 4: Mean Errors To Less Stringent Criterion On
          Postshift Problem For The Various Groups.

Figure 5: Mean Trials To Stringent Criterion On Postshift
          Problem For The Various Groups.

Figure 6: P(c) Before The T.L.E. Vincentized Into 4 Parts.
          Preshift Data With Stringent Criterion.




[Plot: P(c) before T.L.E. (vertical axis, .10 to 1.00) against Vincent
quartiles 1 to 4 (horizontal axis), with separate curves for Old,
Detailed; Old, Brief; Young, Brief; and Young, Detailed.]

Figure 7: P(c) Before The T.L.E. Vincentized Into 4 Parts.
          Preshift Data With Less Stringent Criterion.
Figure 8: Cumulative Proportion Of Errors To Criterion For Young Ss, Detailed
          Instructions, Preshift Problem Under Less Stringent Criterion.

Figure 9: Cumulative Proportion Of Errors To Criterion For Old Ss, Detailed
          Instructions, Preshift Problem Under Less Stringent Criterion.
TABLE 1: Mean Trials To Criterion For The Various
         Groups On Preshift Problem.
STATISTIC                                     OBSERVED   PREDICTED

Mean Total Number Errors (T)
    VAR(T)                                      180.76       83.01
Trial Number Last Error (N)
    E(N)                                         18.58       17.88
    VAR(N)                                      654.51      302.04
Number Successes Before Last Error (Z)
    E(Z)                                          8.58        8.26
    VAR(Z)                                      161.56       76.51
Number Successes Between K, K + 1 Errors (H)
    E(H)                                           .93         .85
    VAR(H)                                        1.62        1.59
    K = 0                                          .52         .53
    K = 1                                          .21         .24
    K = 2                                          .15         .11
Number Errors Between K, K + 1 Successes
    K = 0
        E(JK)                                     1.08         .93
        VAR(JK)                                   3.21        1.79
    K = 1
        E(JK)                                      .41         .83
        VAR(JK)                                    .77        1.68
    K = 2
        E(JK)                                      .45         .74
        VAR(JK)                                    .52        1.56
Mean Number Alternations                          9.54        9.42
Mean Number Error Runs of Any Length              4.58        4.98
Mean Number Error Runs of Length K
    K = 1                                         2.79        2.58
    K = 2                                         1.00        1.24
    K = 3                                          .70         .60

TABLE 2: Data From Young Ss, Detailed Instructions Condition, Preshift
         Problem Compared To Predictions From Bower-Trabasso Model.
         Data Generated Under The Less Stringent Criterion.



STATISTIC                                     OBSERVED   PREDICTED

Mean Total Number Errors (T)
    VAR(T)                                      123.20       50.51
Trial Number Last Error (N)
    E(N)                                         15.17       14.84
    VAR(N)                                      529.97      205.45
Number Successes Before Last Error (Z)
    E(Z)                                          7.37        7.21
    VAR(Z)                                      147.46       59.30
Number Successes Between K, K + 1 Errors (H)
    E(H)                                           .99         .94
    VAR(H)                                        2.12        1.84
    K = 0                                          .53         .51
    K = 1                                          .24         .24
    K = 2                                          .07         .12
Number Errors Between K, K + 1 Successes
    K = 0
        E(JK)                                     1.04         .82
        VAR(JK)                                   2.47        1.46
    K = 1
        E(JK)                                      .63         .72
        VAR(JK)                                    .94        1.36
    K = 2
        E(JK)                                      .63         .72
        VAR(JK)                                    .94        1.36
Mean Number Alternations                          7.67        7.92
Mean Number Error Runs of Any Length              3.58        4.22
Mean Number Error Runs of Length K
    K = 1                                         2.00        2.33
    K = 2                                         1.16        1.04
    K = 3                                          .66         .46

TABLE 3: Data From Old Ss, Detailed Instructions Condition, Preshift
         Problem Compared To Predictions From Bower-Trabasso Model.
         The Data Generated Under The Stringent Criterion.